RECOMBINANT DNA EXPRESSION VECTORS



Field of the Invention

This invention relates to expression vectors containing a DNA
sequence from the human cytomegalovirus major immediate early gene,
to host cells containing such vectors, to a method of producing a
desired polypeptide by using vectors containing said sequence and to
the use of said DNA sequence.

Background to the Invention 

The main aim of workers in the field of recombinant DNA technology
is to achieve as high a level of production as possible of a
particular polypeptide. This is particularly true of commercial
organisations who wish to exploit the use of recombinant DNA
technology to produce polypeptides which naturally are not very
abundant.

Generally the application of DNA technology involves the cloning of
a gene encoding the desired polypeptide, placing the cloned gene in
a suitable expression vector, transfecting a host cell line with the
vector, and culturing the transfected cell line to produce the
polypeptide. It is almost impossible to predict whether any
particular vector or cell line or combination thereof will lead to a
useful level of production.

In general, the factors which significantly affect the amount of
polypeptide produced by a transfected cell line are: 1. gene copy
number, 2. the efficiency with which the gene is transcribed and the
mRNA translated, 3. the stability of the mRNA and 4. the efficiency
of secretion of the protein.

The majority of work directed at increasing expression levels of
recombinant polypeptides has focussed on improving transcription
initiation mechanisms. As a result the factors affecting efficient
translation are much less well understood and defined, and generally
it is not possible to predict whether any particular DNA sequences
will be of use in obtaining efficient translation.

Attempts at investigating translation have consisted largely of
varying the DNA sequence around the consensus translation start
signal to determine what effect this has on translation initiation
(Kozak M. Cell 44 283-292 (1986)).

Studies involving expression of desired heterologous genes normally
use both the coding sequence and at least part of the
5'-untranslated sequence of the heterologous gene such that
translation initiation is from the natural sequence of the gene.
This approach has been found to be unreliable, probably as a result
of the 'hybrid nature' of the 5'-untranslated region and the fact
that the presence of particular 5'-untranslated sequences can lead to
poor initiation of translation (Kozak M. Proc. Natl. Acad. Sci. USA 83
2850-2854 (1986) and Pelletier and Sonenberg Cell 40 515-526
(1985)). This variation in translation has a detrimental effect on
the amount of the product produced.

Previous studies (Boshart et al Cell 41 521-530 (1985); Pasleau
et al Gene 38 227-232 (1985); Stenberg et al J. Virol 49 (1)
190-199 (1984); Thomsen et al Proc. Natl. Acad. Sci. USA 81 659-663
(1984) and Foecking and Hofstetter Gene 45 101-105 (1986)) have used
sequences from the upstream region of the hCMV-MIE gene in
expression vectors. These have, however, solely been concerned with
the use of the sequences as promoters and/or enhancers. Spaete and
Mocarski (J. Virol 56 (1) 135-143, 1985) have used a PstI to PstI
fragment of the hCMV-MIE gene encompassing the promoter, enhancer
and part of the 5'-untranslated region, as a promoter for expression
of heterologous genes. In order to obtain translation the natural
5'-untranslated region of the heterologous gene was used.

In published European Patent Application No. 260148, a method for
the continuous production of a heterologous protein is described.
The expression vectors constructed contain part of the
5'-untranslated region of the hCMV-MIE gene as a stabilising
sequence. The stabilising sequence is placed in the 5'-untranslated
region of the gene encoding the desired heterologous protein, i.e.
the teaching is again that the natural 5'-untranslated region of the
gene is essential for translation.

Summary of the Invention

In a first aspect the invention provides a vector containing a DNA
sequence comprising the promoter, enhancer and substantially
complete 5'-untranslated region including the first intron of the
major immediate early gene of human cytomegalovirus.

In a preferred embodiment of the first aspect of the invention, the
vector includes a restriction site for insertion of a heterologous
gene.

The present invention is based on the discovery that vectors
containing a DNA sequence comprising the promoter, enhancer and
complete 5'-untranslated region of the major immediate early gene of
the human cytomegalovirus (hCMV-MIE) upstream of a heterologous gene
result in high level expression of the heterologous gene product.
In particular, we have unexpectedly found that when the hCMV-MIE
derived DNA is linked directly to the coding sequence of the
heterologous gene, high levels of mRNA translation are achieved.
This efficient translation of mRNA is achieved consistently and
appears to be independent of the particular heterologous gene being
expressed.

In a second aspect the invention provides a vector containing a DNA
sequence comprising the promoter, enhancer and substantially
complete 5'-untranslated region including the first intron of the
major immediate early gene of human cytomegalovirus upstream of a
heterologous gene.



The hCMV-MIE derived DNA according to the second aspect of the
invention may be separated from the coding sequence of the
heterologous gene by intervening DNA such as for example by the
5'-untranslated region of the heterologous gene. Advantageously the
hCMV-MIE derived DNA may be linked directly to the coding sequence
of the heterologous gene.

In a preferred embodiment of the second aspect of the invention, the
invention provides a vector containing a DNA sequence comprising the
promoter, enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene linked directly to
the DNA coding sequence of the heterologous gene.

Preferably the hCMV-MIE derived sequence includes a sequence
identical to the natural hCMV-MIE translation initiation signal. It
may however be necessary or convenient to modify the natural
translation initiation signal to facilitate linking the coding
sequence of the desired polypeptide to the hCMV-MIE sequence, i.e.
by introducing a convenient restriction enzyme recognition site.
For example the translation initiation site may advantageously be
modified to provide an NcoI recognition site.
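
By way of illustration only, the effect of such a modification can be checked with a short script. The sketch below is not part of the described method. It assumes that the natural leader ends in GACACG immediately before the ATG (as inferred from the oligonucleotides given in the Examples) and that the coding sequence supplies a G at position +4, as a gene carrying an NcoI site at its start codon would. All function and variable names are hypothetical.

```python
# Illustrative sketch: a single base change at position -1 creates an NcoI site.
NCOI_SITE = "CCATGG"  # NcoI recognition sequence

# Assumed sequence context (see lead-in): last six leader bases + ATG + G at +4.
natural = "GACACG" + "ATGG"    # G at position -1
modified = "GACACC" + "ATGG"   # C at position -1

def has_ncoi_site(seq: str) -> bool:
    """Return True if the sequence contains an NcoI recognition site."""
    return NCOI_SITE in seq

def differing_positions(a: str, b: str) -> list:
    """Return the 0-based indices at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

print(has_ncoi_site(natural))    # natural context
print(has_ncoi_site(modified))   # modified context
print(differing_positions(natural, modified))
```

Under these assumptions the two contexts differ at a single position (the base immediately preceding the ATG), and only the modified context contains CCATGG.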

The heterologous gene may be a gene coding for any eukaryotic
polypeptide such as for example a mammalian polypeptide such as an
enzyme, e.g. chymosin or gastric lipase; an enzyme inhibitor, e.g.
tissue inhibitor of metalloproteinase (TIMP); a hormone, e.g. growth
hormone; a lymphokine, e.g. an interferon; a plasminogen activator,
e.g. tissue plasminogen activator (tPA) or prourokinase; or a
natural, modified or chimeric immunoglobulin or a fragment thereof
including chimeric immunoglobulins having dual activity such as
antibody-enzyme or antibody-toxin chimeras.

According to a third aspect of the invention there are provided host
cells transfected with vectors according to the first or second
aspect of the invention.

The host cell may be any eukaryotic cell such as for example plant
or insect cells but is preferably a mammalian cell such as for
example CHO cells or cells of myeloid origin e.g. myeloma or
hybridoma cells.

In a fourth aspect the invention provides a process for the
production of a heterologous polypeptide by culturing a transfected
cell according to the third aspect of the invention.

In a fifth aspect the invention provides the use of a DNA sequence
comprising the promoter, enhancer and substantially complete
5'-untranslated region including the first intron of the hCMV-MIE
gene for expression of a heterologous gene.

In a preferred embodiment of the fifth aspect of the invention the
hCMV-MIE derived DNA sequence is linked directly to the DNA coding
sequence of the heterologous gene.

Also included within the scope of the invention are plasmids pCMGS,
pHT.1 and pEE6hCMV.



Brief Description of the Drawings

The present invention is now described, by way of example only, with
reference to the accompanying drawings in which

Figure 1: shows a diagrammatic representation of plasmid pSVLGS.1

Figure 2: shows a diagrammatic representation of plasmid pHT.1

Figure 3: shows a diagrammatic representation of plasmid pCMGS

Figure 4: shows the complete sequence of the promoter-enhancer
hCMV-MIE including both the first intron and a modified
translation 'start' site

Figure 5: shows a diagrammatic representation of plasmid pEE6.hCMV

Detailed Description of the Embodiments

Example 1

The Pst-1m fragment of hCMV (Boshart et al Cell 41 521-530 (1985);
Spaete & Mocarski J. Virol 56 (1) 135-143 (1985)) contains the
promoter-enhancer and most of the 5'-untranslated leader of the MIE
gene including the first intron. The remainder of the
5'-untranslated sequence can be recreated by attaching a small
additional sequence of approximately 20 base pairs.

Many eukaryotic genes contain an NcoI restriction site (5'-CCATGG-3')
overlapping the translation start site, since this sequence
frequently forms part of a preferred translation initiation signal
5'-ACCATGPu-3'. The hCMV-MIE gene does not have an NcoI site at the
beginning of the protein coding sequence but a single base-pair
alteration both causes the sequence to resemble more closely the
"Kozak" consensus initiation signal and introduces an NcoI
recognition site. Therefore a pair of complementary oligonucleotides
was synthesised of the sequence:

    GTCACCGTCCTTGACAC
    |||||||||||||||||
ACGTCAGTGGCAGGAACTGTGGTAC

which when fused to the Pst-1m fragment of hCMV will recreate the
complete 5'-untranslated sequence of the MIE gene with the single
alteration of a G to a C at position -1 relative to the translation
initiation codon.
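
As an illustrative check only, the complementarity of the two oligonucleotides and the nature of their single-stranded ends can be confirmed with a short script. The sketch assumes the upper strand is written 5' to 3' and the lower strand 3' to 5'; the function names are hypothetical and form no part of the described method.

```python
# Illustrative check of the synthetic PstI - NcoI adaptor.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq: str) -> str:
    """Return the base-by-base complement (not reversed)."""
    return seq.translate(COMPLEMENT)

def describe_duplex(top_5to3: str, bottom_3to5: str):
    """Locate the top strand on the bottom strand.

    Returns (offset, overhang at the bottom strand's 3' end,
    overhang at the bottom strand's 5' end), each overhang read 5'->3'.
    """
    paired = complement(top_5to3)
    offset = bottom_3to5.find(paired)
    if offset < 0:
        raise ValueError("strands are not complementary")
    left = bottom_3to5[:offset]                 # unpaired 3' end (written 3'->5')
    right = bottom_3to5[offset + len(paired):]  # unpaired 5' end (written 3'->5')
    return offset, left[::-1], right[::-1]

TOP = "GTCACCGTCCTTGACAC"             # upper strand, assumed 5'->3'
BOTTOM = "ACGTCAGTGGCAGGAACTGTGGTAC"  # lower strand, assumed 3'->5'

offset, three_prime_overhang, five_prime_overhang = describe_duplex(TOP, BOTTOM)
print(offset, three_prime_overhang, five_prime_overhang)
```

Under these assumptions the 3' overhang reads TGCA, which is compatible with a PstI-cut end, and the 5' overhang reads CATG, which is compatible with an NcoI-cut end.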

This synthetic DNA fragment was introduced between the hCMV Pst-1m
promoter-enhancer leader fragment and a glutamine synthetase (GS)
coding sequence by ligation of the Pst-1m fragment and the synthetic
oligomer with NcoI digested pSV2.GS to generate a new plasmid, pCMGS
(the production of pSV2.GS is described in published International
Patent Application No. WO 87/04462). pCMGS is shown in Figure 3.
pCMGS thus contains a hybrid transcription unit consisting of the
following: the synthetic oligomer described above upstream of the
hCMV-MIE promoter-enhancer (where it serves merely as a convenient
PstI - NcoI "adaptor"), the hCMV-MIE promoter and the complete
5'-untranslated region of the MIE gene, including the first intron,
fused directly to the GS coding sequence at the translation
initiation site.

pCMGS was introduced into CHO-K1 cells by calcium phosphate mediated
transfection and the plasmid was tested for the ability to confer
resistance to the GS-inhibitor methionine sulphoximine (MSX). The
results of a comparison with pSV2.GS are shown in Table 1.

It is clear that pCMGS can confer resistance to 20 µM MSX at a
similar frequency to pSV2.GS, demonstrating that active GS enzyme is
indeed expressed in this vector.



Table 1

Results of transfection of GS-expression vectors into CHO-K1 cells

Vector      No. colonies/10^6 cells
            resistant to 20 µM MSX

pSV2.GS     32
pCMGS       17
Control      0



Example 2

The TIMP cDNA and SV40 polyadenylation signal as used in pTIMP 1
(Docherty et al (1985) Nature 318, 66-69) were inserted into pEE6
between the unique HindIII and BamHI sites to create pEE6TIMP. pEE6
is a bacterial vector from which sequences inhibitory to replication
in mammalian cells have been removed. It contains the XmnI to BclI
portion of pCT54 (Emtage et al 1983 Proc. Natl. Acad. Sci. USA 80,
3671-3675) with a pSP64 (Melton et al 1984 Nucleic Acids Res. 12,
7035) polylinker inserted in between the HindIII and EcoRI sites.
The BamHI and SalI sites have been removed from the polylinker by
digestion, filling in with Klenow enzyme and religation. The BclI
to BamHI fragment is a 237 bp SV40 early polyadenylation signal
(SV40 2770 to 2533). The BamHI to BglI fragment is derived from
pBR328 (375 to 2422) with an additional deletion between the SalI
and the AvaI sites (651 to 1425) following the addition of a SalI
linker to the AvaI site. The sequence from the BglI to the XmnI
site originates from the β-lactamase gene of pSP64.

The 2129 base-pair NcoI fragment containing the hCMV-MIE
promoter-enhancer and 5'-untranslated sequence was isolated from
pCMGS by partial NcoI digestion and inserted at the NcoI site
overlapping the translation initiation signal of TIMP in pEE6.TIMP
to generate the plasmid pHT.1 (shown in Figure 2).
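
The reason a partial digest may be needed can be illustrated with a short script: a fragment bounded by two NcoI sites but containing a further NcoI site internally is only recovered intact if the internal site escapes cutting. The sketch below is illustrative only. It uses a made-up toy sequence, not the pCMGS sequence, treats the molecule as linear for simplicity (pCMGS is circular), and uses hypothetical function names.

```python
from itertools import combinations

NCOI = "CCATGG"      # NcoI recognition sequence
NCOI_CUT_OFFSET = 1  # NcoI cuts after the first C (C^CATGG)

def find_sites(seq, site=NCOI):
    """Return the 0-based start index of every occurrence of the site."""
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def cut_points(seq):
    """Ends of the linear molecule plus every NcoI cut position."""
    return [0] + [p + NCOI_CUT_OFFSET for p in find_sites(seq)] + [len(seq)]

def complete_digest(seq):
    """Fragment lengths when every site is cut."""
    b = cut_points(seq)
    return [y - x for x, y in zip(b, b[1:])]

def partial_digest(seq):
    """All fragment lengths obtainable when any subset of sites is cut."""
    return sorted({y - x for x, y in combinations(cut_points(seq), 2)})

# Hypothetical toy sequence with three NcoI sites.
toy = "A" * 10 + NCOI + "T" * 20 + NCOI + "G" * 30 + NCOI + "C" * 5

print(find_sites(toy))
print(complete_digest(toy))
print(partial_digest(toy))
```

In this toy case the fragment running from the first to the third site, with the middle site uncut, appears only among the partial-digest products.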

A GS gene was introduced into pHT.1 to allow selection of permanent
cell lines by introducing the 5.5 kb PvuI - BamHI fragment of pSVLGS.1
(Figure 1) at the BamHI site of pHT.1 after addition of a synthetic
BamHI linker to PvuI digested pSVLGS.1 to form pHT.1GS. In this
plasmid the hCMV-TIMP and GS transcription units transcribe in the
same orientation.

pHT.1GS was introduced into CHO-K1 cells by calcium-phosphate
mediated transfection and clones resistant to 20 µM MSX were isolated
2-3 weeks post-transfection. TIMP secretion rates were determined
by testing culture supernatants in a specific two site ELISA, based
on a sheep anti-TIMP polyclonal antibody as a capture antibody and a
mouse TIMP monoclonal as the recognition antibody, binding of the
monoclonal being revealed using a sheep anti-mouse IgG peroxidase
conjugate. Purified natural TIMP was used as a standard for
calibration of the assay and all curves were linear in the range of
2 - 20 ng ml^-1. No non-specific reaction was detectable in CHO-cell
conditioned culture media.

One cell line, GS.19, was subsequently recloned, and a sub-clone, GS
19-12, secretes TIMP at a very high level of 3 x 10^8
molecules/cell/day. Total genomic DNA extracted from this cell line
was hybridised with a TIMP probe by Southern blot analysis using
standard techniques and shown to contain a single intact copy of the
TIMP transcription unit per cell (as well as two re-arranged plasmid
bands). This cell line was selected for resistance to higher levels
of MSX and in the first selection a pool of cells resistant to 500 µM
MSX was isolated and recloned. The clone GS-19.6(500)14 secretes
5 x 10^9 molecules TIMP/cell/day. The vector copy-number in this
cell line is approx. 20 - 30 copies/cell. Subsequent rounds of
selection for further gene amplification did not lead to increased
TIMP secretion.

Thus it appears that the hCMV-TIMP transcription unit from pHT.1 can
be very efficiently expressed in CHO-K1 cells at approximately a
single copy per cell and a single round of gene amplification leads
to secretion rates which are maximal using current methods.
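
For orientation, a secretion rate expressed in molecules/cell/day can be converted to a mass using Avogadro's number. The sketch below is illustrative only. The molar mass used for TIMP (28,000 g/mol) is an assumption made for this example and is not stated in this specification; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_to_picograms(molecules: float, molar_mass_g_per_mol: float) -> float:
    """Convert a number of molecules to a mass in picograms."""
    grams = molecules / AVOGADRO * molar_mass_g_per_mol
    return grams * 1e12

# Secretion rate quoted for sub-clone GS 19-12 (molecules/cell/day).
rate_molecules = 3e8

# ASSUMPTION: approximate molar mass of TIMP, chosen for illustration only.
ASSUMED_TIMP_MOLAR_MASS = 28_000.0

rate_pg = molecules_to_picograms(rate_molecules, ASSUMED_TIMP_MOLAR_MASS)
print(f"{rate_pg:.1f} pg/cell/day")
```

With the assumed molar mass, 3 x 10^8 molecules/cell/day corresponds to roughly 14 pg/cell/day; a different molar mass scales the result proportionally.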

Example 3

In order to test whether the hCMV-MIE promoter-enhancer-leader can
be used to direct the efficient expression of other protein
sequences, two different but related plasminogen activator coding
sequences (designated PA-1 and PA-2) were introduced into CHO-K1
cells in vectors in which the protein coding sequences were fused
directly to the hCMV sequence.

In both these cases, there is no NcoI site at the beginning of the
translated sequence and so synthetic oligonucleotides were used to
recreate the authentic coding sequence from suitable restriction
sites within the translated region. The sequence of the modified
hCMV translation-initiation signal as used in pHT.1 was also built
into the synthetic oligonucleotide which then ended in a PstI
restriction site. The Pst-1m fragment of hCMV was then inserted at
this site to create the complete promoter-enhancer-leader sequence.

The hCMV-plasminogen activator transcription units were introduced
into CHO-K1 cells after inserting a GS gene at the unique BamHI site
as above and MSX resistant cell lines secreting plasminogen
activator were isolated.

The secretion rates of the best initial transfectant cell lines in
each case are given in Table 2. From this it is clear that the hCMV
promoter-enhancer leader can also be used to direct the efficient
expression of these two plasminogen activator proteins.



Table 2

Secretion rates of the different plasminogen activator proteins from
transfectant CHO cell lines.

Plasminogen activator     Molecules secreted/cell/day

PA-1                      5.5 x 10^7
PA-2                      1.1 x 10^8



Example 4

pEE6hCMV was made by ligating the Pst-1m fragment of hCMV, HindIII-
digested pEE6 and the complementary oligonucleotides of the sequence:

    GTCACCGTCCTTGACACGA
    |||||||||||||||||||
ACGTCAGTGGCAGGAACTGTGCTTCGA

cDNA encoding an immunoglobulin light-chain was inserted at the
EcoRI site of pEE6.hCMV such that the hCMV-MIE promoter-enhancer
leader could direct expression of the cDNA and a selectable marker
gene containing the SV40 origin of replication was inserted at the
BamHI site of each plasmid.

This plasmid was transfected into COS-1 monkey kidney cells by a
standard DEAE-dextran transfection procedure and transient
expression was monitored 72 hours post transfection. Light chain
was secreted into the medium at at least 100 ng/ml, indicating that
light chain can indeed be expressed from a transcription unit
containing the entire hCMV-MIE 5'-untranslated sequence up to but
not including the translation initiation ATG, followed by 15 bases
of natural 5'-untranslated sequence of the mouse immunoglobulin
light-chain gene.
CLAIMS

1. A vector containing a DNA sequence comprising the promoter,
enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene.

2. A vector according to Claim 1 wherein the vector includes a
restriction site for insertion of a heterologous gene.

3. A vector containing a DNA sequence comprising the promoter,
enhancer and substantially complete 5'-untranslated region
including the first intron of the hCMV-MIE gene upstream of a
heterologous gene.

4. A vector according to Claim 3 wherein the hCMV-MIE DNA is
linked directly to the DNA coding sequence of a heterologous
gene.

5. A vector according to Claim 4 wherein the hCMV-MIE DNA includes
a translation initiation signal.

6. A host cell transfected with a vector according to any of the
preceding claims.

7. A process for the production of a heterologous polypeptide by
culturing a host cell according to Claim 6.

8. The use of a DNA sequence comprising the promoter, enhancer and
substantially complete 5'-untranslated region including the
first intron of the hCMV-MIE gene for expression of a
heterologous gene.

9. The use of a DNA sequence according to Claim 8 wherein the
hCMV-MIE derived DNA is linked directly to the coding sequence
of the heterologous gene.

10. Plasmids pCMGS, pEE6.hCMV and pHT.1.
DsaI, NcoI, StyI, PstI

CCATGGTGTCAAGGACGGTGACTGCAGTGAATAATAAAATGTGTGTTTGTCCGAAATACG

1 + + + + + + 60

GGTACCACAGTTCCTGCCACTGACGTCACTTATTATTTTACACACAAACAGGCTTTATGC

CGTTTTGAGATTTCTGTCGCCGACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCG

61 + + + + + + 120

GCAAAACTCTAAAGACAGCGGCTGATTTAAGTACAGCGCGCTATCACCACAAATAGCGGC 

ClaI

ATAGAGATGGCGATATTGGAAAAATCGATATTTGAAAATATGGCATATTGAAAATGTCGC

121 + + + + + + 180

TATCTCTACCGCTATAACCTTTTTAGCTATAAACTTTTATACCGTATAACTTTTACAGCG 



EcoRV

CGATGTGAGTTTCTGTGTAACTGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATAC 

181 + + + + + + 240 

GCTACACTCAAAGACACATTGACTATAGCGGTAAAAAGGTTTTCACTAAAAACCCGTATG 

EcoRV

GCGATATCTGGCGATAGCGGCTTATATCGTTTACGGGGGATGGCGATAGACGACTTTGGT 

241 + + + + + + 300 

CGCTATAGACCGCTATCGCCGAATATAGCAAATGCCCCCTACCGCTATCTGCTGAAACCA 

GACTTGGGCGATTCTGTGTGTCGCAAATATCGCAGTTTCGATATAGGTGACAGACGATAT 

301 + + + + + + 360 

CTGAACCCGCTAAGACACACAGCGTTTATAGCGTCAAAGCTATATCCACTGTCTGCTATA 

CfrI, BalI, HaeI, NsiI, ClaI

GAGGCTATATCGCCGATAGAGGCGACATCAAGCTGGCACATGGCCAATGCATATCGATCT 

361 + + + + + + 420 

CTCCGATATAGCGGCTATCTCCGCTGTAGTTCGACCGTGTACCGGTTACGTATAGCTAGA 



Fig. 4A

SspI, CfrI, BalI, HaeI

ATACATTGAATCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCA 

421 + + + + + + 480 

TATGTAACTTAGTTATAACCGGTAATCGGTATAATAAGTAACCAATATATCGTATTTAGT 

SspI, CfrI, BalI, HaeI

ATATTGGCTATTGGCCATTGCATACGTTGTATCCATATCATAATATGTACATTTATATTG 

481 + + + + + + 540 

TATAACCGATAACCGGTAACGTATGCAACATAGGTATAGTATTATACATGTAAATATAAC 

HincII, MaeI, SpeI
GCTCATGTCCAACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAAT 

541 + + + + + + 600 

CGAGTACAGGTTGTAATGGCGGTACAACTGTAACTAATAACTGATCAATAATTATCATTA 

CAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGG 

601 + + + + + + 660

GTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCC 

BglI, AhaII, AatII

TAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGT

661 + + + + + + 720

ATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCA 

AhaII, AatII

ATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTAC 

721 + + + + + + 780 

TACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTCATAAATG 

BglI, NdeI

GGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTG

781 + + + + + + 840

CCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAAC 



Fig. 4B

AhaII, AatII, BglI

ACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACT 

841 + + + + + + 900 

TGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTGA 

SnaBI, DsaI, NcoI, StyI

TTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTT 

901 + + + + + + 960 

AAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAA 

GGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACC

961 + + + + + + 1020 

CCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGG 

AhaII, AatII, BanI

CCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTC

1021 + + + + + + 1080

GGTAACTGCAGTTACCCTCAAACAAAACCGTGGTTTTAGTTGCCCTGAAAGGTTTTACAG 

GTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATA 

1081 + + + + + + 1140

CATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACATGCCACCCTCCAGATAT 

BanII, Bsp1286, HgiAI, SacI, GsuI, AhaII

TAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTG

1141 + + + + + + 1200

ATTCGTCTCGAGCAAATCACTTGGCAGTCTAGCGGACCTCTGCGGTAGGTGCGACAAAAC 

N 

B D BCGsSX 

b s gfdpam 